Make partial aggregation adaptive #11011
Merged
Description
This is an optimization for the `HashAggregationOperator` that is split into `partial` and `final` steps.

When the partial aggregation step does not reduce the number of rows much (e.g. 90% of rows are unique), it brings only a small benefit in network savings but costs a lot of CPU.
In this case, it would be beneficial to skip partial aggregation altogether at planning time, but we don't always have reliable statistics for the number of unique values, especially in intermediate query stages, so this is not easy to do.

Instead (although it is complementary to planner changes), this adds a simple runtime adaptation for the partial aggregation step that sends raw, ungrouped rows to the final step if the ratio of unique to input rows is big enough (0.8 by default).

With this change, there is still significant overhead on the partial step, mainly in the `PartitionedOutputOperator`, which has to handle the superfluous accumulator state for the raw rows, and in the `HashAggregationOperator`, which needs to create and populate this state.
There are potential improvements for this in both `HashAggregationOperator` and `PartitionedOutputOperator` that would limit the overhead.

Another possible approach is to have a separate pipe (as it has a different layout) from the `partial` to the `final` step with only the input pages, without the accumulator state. This would eliminate almost all of the overhead but require larger changes in the core engine.

tpch/tpcds benchmark results for orc sf1000
part
Overall ~6% improvement for tpch and 1.5% for tpcds. Most queries are not affected; some gain between 10% and 35%.
adaptive-pa-part-nocode.pdf
unpart
Overall 3.5% improvement for tpch and 2.5% for tpcds.
adaptive-pa-unpart-nocode.pdf
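
To illustrate the runtime adaptation described above, here is a minimal sketch of the decision logic. It is not the code in this PR: the class name `AdaptivePartialAggregationSketch`, the method names, and the `minInputRows` warm-up guard are all hypothetical and introduced only for illustration. The one value taken from the description is the default ratio threshold of 0.8.

```java
// Hypothetical sketch of the unique-to-input ratio check; not the actual operator code.
final class AdaptivePartialAggregationSketch {
    private final double uniqueRowsRatioThreshold; // 0.8 by default, per the description
    private final long minInputRows;               // assumption: observe enough rows before deciding
    private long inputRows;
    private long uniqueRows;
    private boolean skipPartialAggregation;

    AdaptivePartialAggregationSketch(double uniqueRowsRatioThreshold, long minInputRows) {
        this.uniqueRowsRatioThreshold = uniqueRowsRatioThreshold;
        this.minInputRows = minInputRows;
    }

    // Record the rows consumed and the groups (unique rows) produced by one flush
    // of the partial aggregation, then re-evaluate the decision.
    void onFlush(long flushedInputRows, long flushedUniqueRows) {
        inputRows += flushedInputRows;
        uniqueRows += flushedUniqueRows;
        if (!skipPartialAggregation && inputRows > 0 && inputRows >= minInputRows) {
            double ratio = (double) uniqueRows / inputRows;
            // Assumption: "big enough" is modeled as strictly greater than the threshold.
            if (ratio > uniqueRowsRatioThreshold) {
                skipPartialAggregation = true;
            }
        }
    }

    // When true, the partial step would forward raw, ungrouped rows to the final step.
    boolean shouldSkipPartialAggregation() {
        return skipPartialAggregation;
    }
}
```

In this sketch the decision is one-way: once the ratio crosses the threshold, partial aggregation stays disabled. Whether and when to re-enable it is a separate design choice that is not modeled here.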
General information
performance improvement
core query engine (`HashAggregationOperator`)
Improves `group by` performance by skipping the partial aggregation step
Related issues, pull requests, and links
Documentation
( x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
( ) No release notes entries required.
( x) Release notes entries required with the following suggested text:
Improve performance of `GROUP BY` with a large number of groups.